Semiconductor EU stocks

Overview

This project analyzes the stock markets of major European semiconductor companies. The goal of the project is to retrieve financial data from yfinance and use it to forecast the companies' stock prices with time series analysis and machine learning. The results can be applied to trading and financial decision-making; note, however, that this project does not itself provide such decisions. It serves only as a general analysis and guideline.

Initially this project also featured an attempt at forecasting stock close prices with a hybrid LSTM-ARIMA model (inspired by this paper), but after many failed attempts it was scrapped. The original paper does not use the model for time series forecasting, but for trend and buy/sell signal detection. There are many other examples of LSTMs being used to forecast stock data, but for a very volatile market they may not be the best fit.

Data cleansing

Code
import yfinance as yf
import random
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import seaborn as sns
import pandas as pd
import numpy as np
import plotly.graph_objects as go
import plotly.express as px
import time
from datetime import datetime, timedelta

tickers = [
    "ASML.AS", "NXPI", "IFX.DE", "BESI.AS",
    "NOD.OL", "MELE.BR", "AIXA.DE", "SMHN.DE", "AWEVF"
]
all_data = {}
yesterday = datetime.today() - timedelta(days=1)
yesterday_str = yesterday.strftime('%Y-%m-%d')

#Fetch data up to yesterday (a relative end date) so repeated runs give consistent results
for ticker in tickers:
    for attempt in range(3):
        try:
            stock = yf.Ticker(ticker)
            hist = stock.history(period="max", end=yesterday_str)
            if hist is None or hist.empty:
                display(f"No data for {ticker}, attempt {attempt+1}")
                time.sleep(2)
                continue
            all_data[ticker] = hist
            break
        except Exception as e:
            display(f"Error fetching {ticker}: {e}, attempt {attempt+1}")
            time.sleep(2)

#Check out ASML data as a test
if "ASML.AS" in all_data:
    display("ASML stocks tail")
    display(all_data["ASML.AS"].tail())
else:
    display("ASML.AS data not available")

#Clean and process the data into a continuous daily time series
processed_data = {}

for ticker, df in all_data.items():
    if df.empty:
        continue
    df.index = df.index.tz_localize(None) 
    
    df_continuous = df.asfreq('D')
    
    cols_to_ffill = ['Open', 'High', 'Low', 'Close', 'Adj Close']
    existing_cols = [c for c in cols_to_ffill if c in df_continuous.columns]
    df_continuous[existing_cols] = df_continuous[existing_cols].ffill()
    
    if 'Volume' in df_continuous.columns:
        df_continuous['Volume'] = df_continuous['Volume'].fillna(0)
    
    processed_data[ticker] = df_continuous
'ASML stocks tail'
Open High Low Close Volume Dividends Stock Splits
Date
2026-02-09 00:00:00+01:00 1200.000000 1205.000000 1177.400024 1204.800049 456425 1.6 0.0
2026-02-10 00:00:00+01:00 1196.000000 1212.400024 1185.800049 1193.000000 459476 0.0 0.0
2026-02-11 00:00:00+01:00 1185.400024 1224.000000 1176.599976 1207.800049 530632 0.0 0.0
2026-02-12 00:00:00+01:00 1225.000000 1225.000000 1176.599976 1179.800049 558696 0.0 0.0
2026-02-13 00:00:00+01:00 1190.599976 1210.599976 1173.800049 1190.400024 708101 0.0 0.0

Line chart plot

After cleaning and processing the data, the next step is to visualize the stock prices in a clean line chart. Plotly offers some of the cleanest and most interactive visualizations for this. Plotly has downsides, however, the main ones being memory consumption and speed, which is why it is not recommended for very large datasets.
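One practical way to keep Plotly responsive on long histories is to thin the series before plotting. A minimal sketch with pandas on synthetic data (the weekly resampling rule is an illustrative choice, not something used elsewhere in this project):

```python
import numpy as np
import pandas as pd

# Synthetic daily series standing in for a long price history
idx = pd.date_range("2015-01-01", periods=4000, freq="D")
series = pd.Series(np.random.default_rng(0).standard_normal(4000).cumsum(), index=idx)

# Resample to weekly closes: one point per week instead of one per day,
# cutting the number of plotted points roughly sevenfold
weekly = series.resample("W").last()
print(len(series), "->", len(weekly))
```

The downsampled series can then be passed to `go.Scatter` exactly like the full one.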

Code
fig = go.Figure()
for ticker, data in processed_data.items():
    fig.add_trace(
        go.Scatter(
            x=data.index,
            y=data['Close'],
            mode='lines',
            name=f"{ticker} Close"
        )
    )
fig.update_layout(
    title="European Semiconductor Companies - Close Prices",
    xaxis_title="Time",
    yaxis_title="Close Price (local listing currency)",
    legend_title="Company"
)

fig.show()
Figure 1: Time series line plot

Below is the same line chart restricted to the last 500 days.

Code
fig = go.Figure()
for ticker, data in processed_data.items():
    data = data.tail(500)
    # Print each ticker's close-price range over the window as a sanity check
    print(f"{ticker}: {data['Close'].min():.2f} - {data['Close'].max():.2f}")
    fig.add_trace(
        go.Scatter(
            x=data.index,
            y=data['Close'],
            mode='lines',
            name=f"{ticker} Close"
        )
    )

fig.update_layout(
    title="European Semiconductor Companies - Close Prices of last 500 days",
    xaxis_title="Time",
    yaxis_title="Close Price (local listing currency)",
    legend_title="Company"
)

fig.show()
ASML.AS: 545.14 - 1223.16
NXPI: 151.41 - 249.75
IFX.DE: 24.34 - 43.51
BESI.AS: 77.81 - 176.20
NOD.OL: 92.66 - 169.10
MELE.BR: 41.52 - 74.74
AIXA.DE: 9.10 - 22.77
SMHN.DE: 24.04 - 69.95
AWEVF: 1.10 - 2.97
Figure 2: Time series line plot

MACD analysis

Next is the MACD analysis. MACD (Moving Average Convergence Divergence) is a commonly used indicator in financial statistics and trading: it reveals general trends in a stock for buying and selling, which makes it an important step in stock market analysis. It is recommended to zoom in on the plots to see the MACD results and the candlestick chart more clearly.
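The MACD computation itself is short. The sketch below, on synthetic prices, shows the convention used in this project: 12- and 26-day EMAs, a 9-day signal line (the standard default parameters, not tuned for any particular market):

```python
import numpy as np
import pandas as pd

# Synthetic price series standing in for a real close-price history
rng = np.random.default_rng(42)
close = pd.Series(100 + rng.standard_normal(300).cumsum())

# MACD = EMA(12) - EMA(26); signal line = EMA(9) of the MACD
ema12 = close.ewm(span=12, adjust=False).mean()
ema26 = close.ewm(span=26, adjust=False).mean()
macd = ema12 - ema26
signal = macd.ewm(span=9, adjust=False).mean()

# MACD above its signal line is commonly read as bullish momentum
print("bullish" if macd.iloc[-1] > signal.iloc[-1] else "bearish")
```

With `adjust=False` both EMAs start at the first close, so the MACD is exactly zero on day one and only diverges as the averages separate.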

Code
from plotly.subplots import make_subplots

for ticker, data in all_data.items():
    # 12- and 26-day exponential moving averages of the close price
    data['EMA12'] = data['Close'].ewm(span=12, adjust=False).mean()
    data['EMA26'] = data['Close'].ewm(span=26, adjust=False).mean()

    # MACD line and its 9-day signal line
    data['MACD'] = data['EMA12'] - data['EMA26']
    data['Signal_Line'] = data['MACD'].ewm(span=9, adjust=False).mean()

    # Compare the last two rows to detect a fresh crossover
    last_row = data.iloc[-1]
    second_last_row = data.iloc[-2]

    if second_last_row['MACD'] > second_last_row['Signal_Line'] and last_row['MACD'] < last_row['Signal_Line']:
        print(f'{ticker}: Cross Below Signal Line → Potential Bearish Signal')
    elif second_last_row['MACD'] < second_last_row['Signal_Line'] and last_row['MACD'] > last_row['Signal_Line']:
        print(f'{ticker}: Cross Above Signal Line → Potential Bullish Signal')
    else:
        # No crossover: report the prevailing trend instead
        if last_row['MACD'] > last_row['Signal_Line']:
            trend = 'Bullish Trend'
        elif last_row['MACD'] < last_row['Signal_Line']:
            trend = 'Bearish Trend'
        else:
            trend = 'Neutral / Flat'
        print(f'{ticker}: No Crossover → {trend}')

for ticker, data in all_data.items():
    fig = make_subplots(
        rows=2, cols=1,
        shared_xaxes=True,
        vertical_spacing=0.1,
        row_heights=[0.7, 0.3],
        subplot_titles=(f'{ticker} Price', 'MACD')
    )

    
    fig.add_trace(go.Candlestick(
        x=data.index,
        open=data['Open'],
        high=data['High'],
        low=data['Low'],
        close=data['Close'],
        name='Price'
    ), row=1, col=1)

    
    fig.add_trace(go.Scatter(
        x=data.index,
        y=data['MACD'],
        mode='lines',
        name='MACD',
        line=dict(color='green')
    ), row=2, col=1)

    
    fig.add_trace(go.Scatter(
        x=data.index,
        y=data['Signal_Line'],
        mode='lines',
        name='Signal Line',
        line=dict(color='red')
    ), row=2, col=1)

    
    macd_hist = data['MACD'] - data['Signal_Line']
    fig.add_trace(go.Bar(
        x=data.index,
        y=macd_hist,
        name='MACD Histogram',
        marker_color=['green' if val >= 0 else 'red' for val in macd_hist],
        opacity=0.6
    ), row=2, col=1)

    
    fig.update_layout(
        title=f'{ticker} Candlestick & MACD',
        xaxis_rangeslider_visible=False,
        legend=dict(x=0, y=1.15, orientation='h'),
        height=700
    )

    fig.show()
ASML.AS: No Crossover → Bearish Trend
NXPI: No Crossover → Bullish Trend
IFX.DE: Cross Above Signal Line → Potential Bullish Signal
BESI.AS: No Crossover → Bearish Trend
NOD.OL: No Crossover → Bullish Trend
MELE.BR: No Crossover → Bearish Trend
AIXA.DE: No Crossover → Bullish Trend
SMHN.DE: No Crossover → Bearish Trend
AWEVF: No Crossover → Bullish Trend
Figure 3: Candlestick chart with MACD line, signal line, and MACD histogram for each ticker (one panel per company)

RSI analysis

The next technical indicator is RSI (Relative Strength Index). The indicator helps to identify overbought and oversold conditions, and hence buy and sell signals. Using RSI and MACD together is a common way to get a more reliable picture of stock market trends for trading.
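Under the hood, `RSIIndicator` uses Wilder's smoothed averages of gains and losses. A minimal manual sketch on synthetic prices, which should closely match the library's output (the computation here is illustrative, not a drop-in replacement):

```python
import numpy as np
import pandas as pd

# Synthetic price series standing in for a real close-price history
rng = np.random.default_rng(0)
close = pd.Series(50 + rng.standard_normal(100).cumsum())

# Split day-over-day changes into gains and losses
delta = close.diff()
gain = delta.clip(lower=0)
loss = -delta.clip(upper=0)

# Wilder smoothing is an EWM with alpha = 1/window (window = 14 here)
avg_gain = gain.ewm(alpha=1 / 14, min_periods=14, adjust=False).mean()
avg_loss = loss.ewm(alpha=1 / 14, min_periods=14, adjust=False).mean()

rs = avg_gain / avg_loss
rsi = 100 - 100 / (1 + rs)
print(round(rsi.iloc[-1], 2))  # a value between 0 and 100
```

Readings above 70 are conventionally treated as overbought and below 30 as oversold, which is exactly what the dashed lines in the plots below mark.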

Code
from ta.momentum import RSIIndicator

for ticker, data in all_data.items():
    close_values = data['Close']
     
    rsi_14 = RSIIndicator(close=close_values, window=14)
    rsi_series = rsi_14.rsi()
    
    fig = go.Figure()
    
    fig.add_trace(go.Scatter(
        x=close_values.index, 
        y=rsi_series, 
        mode='lines', 
        name=f'{ticker} RSI'
    ))
    
    fig.add_hline(y=70, line_dash="dash", line_color="red", annotation_text="Overbought")
    fig.add_hline(y=30, line_dash="dash", line_color="green", annotation_text="Oversold")
    
    fig.update_layout(
        title=f"RSI (14) for {ticker}",
        xaxis_title="Date",
        yaxis_title="RSI",
        yaxis=dict(range=[0, 100])
    )
    
    fig.show()
Figure 4: RSI (14) with overbought (70) and oversold (30) levels for each ticker (one panel per company)

GARCH model

The GARCH (Generalized Autoregressive Conditional Heteroskedasticity) model is a popular statistical model for time series analysis, especially in trading and quantitative finance. Its main financial application is to examine and forecast market volatility, which is especially important for volatile markets like the semiconductor market. The Q-Q plot is an important sanity check for the market data: it shows whether the returns align with a standard probability distribution, with points falling on a straight line indicating agreement with that distribution.
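To build intuition for what GARCH(1,1) captures, the sketch below simulates the process sigma²(t) = omega + alpha·r(t−1)² + beta·sigma²(t−1) with illustrative (not fitted) parameters and checks that squared returns cluster in time, which is the stylized fact the model is designed for:

```python
import numpy as np

# Illustrative parameters; alpha + beta < 1 keeps the process stationary
rng = np.random.default_rng(1)
omega, alpha, beta = 0.1, 0.08, 0.9

n = 2000
r = np.zeros(n)
sigma2 = np.zeros(n)
sigma2[0] = omega / (1 - alpha - beta)  # unconditional variance

for t in range(1, n):
    # Today's variance depends on yesterday's shock and yesterday's variance
    sigma2[t] = omega + alpha * r[t - 1] ** 2 + beta * sigma2[t - 1]
    r[t] = np.sqrt(sigma2[t]) * rng.standard_normal()

# Squared returns are positively autocorrelated: quiet and turbulent
# stretches cluster together rather than mixing at random
sq = r ** 2
autocorr = np.corrcoef(sq[:-1], sq[1:])[0, 1]
print(round(autocorr, 3))
```

The `arch` fits below estimate omega, alpha[1], and beta[1] from the observed percentage returns instead of fixing them by hand.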

Code
import statsmodels.api as sm
import matplotlib.pyplot as plt
from arch import arch_model
import datetime as dt

for ticker, data in all_data.items():
    
    returns = 100 * data['Close'].pct_change().dropna()
    
    # Q-Q Plot for Normality Check
    sm.qqplot(returns, line='s')
    plt.title(f'{ticker} Returns Q-Q Plot')
    plt.show()

    split_date = dt.datetime(2026, 2, 11) 

    am = arch_model(returns, vol='Garch', p=1, q=1, dist='normal')
    res = am.fit(update_freq=5, disp='off', last_obs=split_date,options={'ftol': 1e-4})
    display(res.summary())
    
    # Residuals vs conditional volatility
    fig = res.plot(annualize="D")
    plt.show()

    
    fixed_res = am.fix([0.0235, 0.01, 0.06, 0.0]) #fix with parameters
    display("Fixed results:")
    display(fixed_res.summary())

    #Compare Volatility Estimates
    df_vol = pd.concat([res.conditional_volatility, fixed_res.conditional_volatility], axis=1)
    df_vol.columns = ["Estimated", "Fixed"]
    
    ax = df_vol.plot(figsize=(7, 4))
    ax.set_title(f"{ticker}: Estimated vs Fixed Volatility")
    plt.show()
    
    #Forecasting
    forecasts = res.forecast(horizon=5, align='origin')
    display(forecasts.variance.dropna().head())
    
    forecast_var = forecasts.variance.iloc[-1]
    forecast_vol_annual = np.sqrt(forecast_var) * np.sqrt(252)

    cond_vol_annual = res.conditional_volatility * np.sqrt(252)
    realized_vol_annual = returns.rolling(window=5).std() * np.sqrt(252)

    forecast_dates = pd.date_range(start=split_date + pd.Timedelta(days=1), periods=5, freq='B')
    forecast_series = pd.Series(forecast_vol_annual.values, index=forecast_dates) #for plotting

    plt.figure(figsize=(12, 6))
    
    plot_start = returns.index[max(0, len(returns)-100)]
    
    plt.plot(realized_vol_annual.loc[plot_start:], label='Realized Vol (5-day)', color='gray', alpha=0.3)
    plt.plot(cond_vol_annual.loc[plot_start:split_date], label='In-Sample GARCH', color='blue')
    
    plt.plot(forecast_series, label='5-Day Out-of-Sample Forecast', color='red', marker='o', markersize=5)
    
    plt.axvline(split_date, color='black', linestyle='--', alpha=0.5)
    plt.title(f'Volatility Forecast: {ticker}')
    plt.ylabel('Annualized Volatility (%)')
    plt.legend()
    plt.grid(True, alpha=0.2)
    plt.show()
Estimated GARCH(1,1) parameters per ticker (Constant Mean model, Normal distribution, robust covariance; condensed from the per-ticker fit summaries, in loop order):

Ticker     Obs     mu        omega    alpha[1]  beta[1]   Log-Lik.
ASML.AS    7093    -0.0082   0.0892   0.0587    0.9413    -18034.3
NXPI       3901     0.1002   0.0089   0.0722    0.9278     -8841.7
IFX.DE     6621     0.0881   0.0664   0.0540    0.9370    -15452.4
BESI.AS    7093     0.1554   0.0700   0.0483    0.9446    -16940.8
NOD.OL     6620     0.1177   1.1929   0.0587    0.8456    -17462.0
MELE.BR    6091     0.1293   0.1547   0.0836    0.8868    -12943.7
AIXA.DE    6971     0.0638   0.0851   0.0276    0.9669    -18540.8
SMHN.DE    6834     0.1177   0.1294   0.0407    0.9509    -18235.1
AWEVF       372     0.4701   3.6103   0.3482    0.6518     -1144.4

The fixed-parameter comparison model uses the same user-specified values for every ticker (mu = 0.0235, omega = 0.0100, alpha[1] = 0.0600, beta[1] = 0.0000); standard errors are not available when the model is not estimated.

Each fit also prints the 5-step-ahead variance forecasts (h.1–h.5) from the last four in-sample dates, e.g. for ASML.AS:

h.1 h.2 h.3 h.4 h.5
Date
2026-02-10 8.343298 8.432530 8.521762 8.610994 8.700226
2026-02-11 8.034492 8.123724 8.212956 8.302188 8.391420
2026-02-12 7.965423 8.054655 8.143887 8.233119 8.322351
2026-02-13 7.635522 7.724754 7.813986 7.903218 7.992450

Figure 5: Volatility model per ticker: returns Q-Q plot, standardized residuals and conditional volatility, estimated vs fixed volatility, and annualized 5-day volatility forecast

Statistical checks

The next analysis checks the skewness and mode of the stock market data, among other statistical measures. These are important for a detailed understanding of the stock markets; some analysts argue that positive skewness is a good indicator for buying. Other statistical measures such as the mean and standard deviation are also important for stock market analysis and can help with buy/sell decisions.
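Pearson's second skewness coefficient used below is simply 3·(mean − median)/std. A small worked example on hypothetical prices, where one large value pulls the mean above the median and produces positive (right) skew:

```python
import numpy as np

# Hypothetical right-skewed sample: one outlier drags the mean upward
prices = np.array([10.0, 11.0, 11.5, 12.0, 12.5, 13.0, 25.0])

mean = prices.mean()        # ≈ 13.571
median = np.median(prices)  # 12.0
std = prices.std()          # population standard deviation

pearson2 = 3 * (mean - median) / std
print(round(pearson2, 3))  # → 0.992, positive = right-skewed
```

A value near zero would indicate a roughly symmetric distribution; the close-price series below are all positively skewed to varying degrees.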

Code
from scipy.stats import skew

for ticker, data in all_data.items():
    close_values = data['Close']
    close_skewness = skew(close_values)
    close_mean = np.mean(close_values)
    close_std = np.std(close_values)
    close_median = np.median(close_values)
    
    display(f"Skewness of {ticker}:", close_skewness)

    pearson_skewness = (3 * (close_mean - close_median)) / close_std
    display(f"Pearson's Second Skewness of {ticker}:", pearson_skewness)
    mode_val = close_values.mode().iloc[0]

    plt.figure(figsize=(8, 5))
    sns.kdeplot(close_values)
    plt.axvline(close_mean, label="Mean")
    plt.axvline(close_median, color="black", label="Median")
    plt.axvline(mode_val, color="green", label="Mode")
    plt.title(f"Distribution of {ticker} Close Prices (Skewness)")
    plt.xlabel("Price")
    plt.legend()
    plt.show()
Skewness and Pearson's second skewness of the close prices per ticker:

Ticker     Skewness   Pearson's 2nd
ASML.AS    1.726      1.489
NXPI       0.436      0.708
IFX.DE     1.010      1.399
BESI.AS    2.084      1.545
NOD.OL     1.891      1.328
MELE.BR    0.462      1.316
AIXA.DE    3.207      0.959
SMHN.DE    1.681      1.291
AWEVF      0.158      0.572

Figure 6: Distribution of close prices with mean, median, and mode marked for each ticker

XGBoost

XGBoost is the first of the ML models used in this project. XGBoost is one of the most popular gradient-boosting implementations and can perform well on time series data when given suitable date-based features. XGBoost is a fairly complicated model, so it is easier to interpret the results than the model itself. The time series line plot for these models includes only the last 500 days of historical data for easier visualization.

Code
import xgboost as xgb
from sklearn.metrics import mean_squared_error

colors = px.colors.qualitative.Alphabet

#First it's important to go through the data and separate each feature for training
def create_features(df, label=None):
    df = df.copy()
    df['date'] = df.index
    df['date'] = pd.to_datetime(df['date'])
    df['hour'] = df['date'].dt.hour
    df['dayofweek'] = df['date'].dt.dayofweek
    df['quarter'] = df['date'].dt.quarter
    df['month'] = df['date'].dt.month
    df['year'] = df['date'].dt.year
    df['dayofyear'] = df['date'].dt.dayofyear
    df['dayofmonth'] = df['date'].dt.day
    df['weekofyear'] = df['date'].dt.isocalendar().week

    X = df[['hour','dayofweek','quarter','month','year',
           'dayofyear','dayofmonth','weekofyear']]
    if label:
        y = df[label]
        return X, y
    return X

fig = go.Figure()
for i, (ticker, data) in enumerate(processed_data.items()):
    current_color = colors[i % len(colors)]
    data = data.sort_index()
    split_date = '10-Feb-2026' 
    stock_train = data.loc[data.index <= split_date].copy()
    stock_test = data.loc[data.index > split_date].copy()

    X_train, y_train = create_features(stock_train, label='Close')
    X_test, y_test = create_features(stock_test, label='Close')

    reg = xgb.XGBRegressor(n_estimators=1000, early_stopping_rounds=50)
    reg.fit(X_train, y_train,
            eval_set=[(X_train, y_train), (X_test, y_test)], verbose=False)

    forecast_periods = 50
    
    data_recent = data.tail(500).copy()
    data_recent.index = pd.to_datetime(data_recent.index)
    data_recent = data_recent.sort_index()

    hist_x = data_recent.index
    future_start = hist_x[-1] + pd.Timedelta(days=1)
    future_dates = pd.date_range(start=future_start, periods=forecast_periods, freq='B')
    future_df = pd.DataFrame(index=future_dates)
    X_future = create_features(future_df)

    forecast = reg.predict(X_future)

    last_hist_date = data_recent.index[-1]
    last_hist_close = data_recent['Close'].iloc[-1]

    plot_forecast_dates = pd.Index([last_hist_date]).append(future_dates)
    plot_forecast_values = np.concatenate(([last_hist_close], forecast))

    fig.add_trace(go.Scatter(
        x=data_recent.index,
        y=data_recent['Close'],
        mode='lines',
        name=f'Historical Market Close of {ticker}',
        line=dict(color=current_color)
    ))

    fig.add_trace(go.Scatter(
        x=plot_forecast_dates,
        y=plot_forecast_values,
        mode='lines',
        name=f'Predicted Future Close of {ticker}',
        line=dict(color=current_color, dash='dash')
    ))
    
fig.update_layout(
    title='Stock Close Price vs XGBoost Prediction',
    xaxis_title='Date',
    yaxis_title='Price',
    template='plotly_white'
)

fig.show()
Figure 7

Results

This section is an overview of the results from the preceding analysis and forecasts. Because the markets are extremely volatile and many of the stocks, most notably ASML, have been skyrocketing in value lately, making forecasts is difficult; some of the stocks were already expected to fall according to the most recent data.

From the first line chart (Figure 1) and the market trend analysis you can see which companies have the strongest trends. ASML has been performing exceptionally, but its stock value is currently experiencing a significant decrease. The other companies have similarly volatile stocks.

From the MACD (Figure 3) and RSI (Figure 4) indicator analysis it is easy to see that the markets are very volatile. The RSI plots swing heavily between oversold and overbought for most of the companies. This makes stock market analysis especially difficult and leaves the markets heavily exposed to speculation. One of the most 'stable' markets is that of AWEVF, partly because it is a new company.

From the GARCH model (Figure 5) and the related statistical analysis, you can determine the most important trends and qualities when it comes to market volatility. In this case, the most important plots are the 'estimated vs fixed volatility' and forecast plots (remember to scroll to the left to see the forecast). From these plots you can make decisions for risk management, banking regulation, and derivatives management.
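The annualized volatility shown in these plots comes from the usual square-root-of-time rule with 252 trading days per year (the 1.8% daily figure below is a hypothetical example, not a fitted value):

```python
import numpy as np

# Square-root-of-time rule: sigma_annual = sigma_daily * sqrt(252)
daily_vol_pct = 1.8  # hypothetical daily volatility of 1.8%

annual_vol_pct = daily_vol_pct * np.sqrt(252)
print(round(annual_vol_pct, 1))  # → 28.6 (percent per year)
```

This is the same scaling applied to both the conditional volatility and the rolling realized volatility in the code above.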

The statistical checks are somewhat optional, but measures such as skewness (Figure 6) can provide significant details about the stock markets.

Finally, the project applies the time series forecasting model, XGBoost (Figure 7). As you can see from the graphs, XGBoost gives quite realistic forecasts of the market close values. However, for some of the tickers the XGBoost model may be too primitive and produce unrealistic forecasts.

The goal of this project has been to provide a wide variety of tools and models for stock market analysis and forecasting that can be applied to trading, investing, portfolio management, and so on. The models should not be treated as firmly accurate, but rather as experimental.